% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
%
\documentclass[
  10pt,
]{article}
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
  \usepackage{unicode-math}
  \defaultfontfeatures{Scale=MatchLowercase}
  \defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
  \usepackage[]{microtype}
  \UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
  \IfFileExists{parskip.sty}{%
    \usepackage{parskip}
  }{% else
    \setlength{\parindent}{0pt}
    \setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
  \KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\IfFileExists{bookmark.sty}{\usepackage{bookmark}}{\usepackage{hyperref}}
\hypersetup{
  pdftitle={S1 Appendix: Does Suffering Suffice? An Experimental Assessment of Desert Retributivism},
  hidelinks,
  pdfcreator={LaTeX via pandoc}}
\urlstyle{same} % disable monospaced font for URLs
\usepackage[top=0.85in,footskip=0.75in]{geometry}
\usepackage{longtable,booktabs}
% Correct order of tables after \paragraph or \subparagraph
\usepackage{etoolbox}
\makeatletter
\patchcmd\longtable{\par}{\if@noskipsec\mbox{}\fi\par}{}{}
\makeatother
% Allow footnotes in longtable head/foot
\IfFileExists{footnotehyper.sty}{\usepackage{footnotehyper}}{\usepackage{footnote}}
\makesavenoteenv{longtable}
\usepackage{graphicx,grffile}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
% Set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{-\maxdimen} % remove section numbering
\usepackage{color}
\usepackage{caption}
\usepackage{float}
\usepackage{dcolumn}
\usepackage{tabu}
\usepackage{booktabs}
\usepackage{longtable}
\usepackage{array}
\usepackage{multirow}
\usepackage{wrapfig}
\usepackage{float}
\usepackage{colortbl}
\usepackage{pdflscape}
\usepackage{tabu}
\usepackage{threeparttable}
\usepackage{threeparttablex}
\usepackage[normalem]{ulem}
\usepackage{makecell}
\usepackage{xcolor}

\title{S1 Appendix: Does Suffering Suffice? An Experimental Assessment of Desert Retributivism}
\author{}
\date{\vspace{-2.5em}}

\begin{document}
\maketitle

\newcommand*{\secref}[1]{Section~\ref{#1}}

\setcounter{table}{0}
\renewcommand{\thetable}{\Alph{table}}
\renewcommand{\tablename}{Table}

\setcounter{figure}{0}
\renewcommand\thefigure{\Alph{figure}}
\renewcommand{\figurename}{Fig}

\clearpage

\vspace{4cm}

\tableofcontents

\clearpage

\hypertarget{summary-statistics}{%
\section{Summary statistics}\label{summary-statistics}}

Table \ref{tableA} displays summary statistics for the numeric variables. For the categorical variables we refer to the graphs in the manuscript.

\begin{table}[!ht] \centering 
  \caption{Summary statistics for age and sex} 
  \label{tableA} 
\footnotesize 
\begin{tabular}{@{\extracolsep{.2pt}}lccccccc} 
\\[-1.8ex]\hline 
\hline \\[-1.8ex] 
Statistic & \multicolumn{1}{c}{N} & \multicolumn{1}{c}{Mean} & \multicolumn{1}{c}{St. Dev.} & \multicolumn{1}{c}{Min} & \multicolumn{1}{c}{Pctl(25)} & \multicolumn{1}{c}{Pctl(75)} & \multicolumn{1}{c}{Max} \\ 
\hline \\[-1.8ex] 
age & 881 & 38.90 & 12.04 & 18 & 30 & 47 & 73 \\ 
sex & 881 & 0.50 & 0.50 & 0 & 0 & 1 & 1 \\ 
\hline \\[-1.8ex] 
\end{tabular} 
\end{table}

\hypertarget{balance-statistics}{%
\section{Balance statistics}\label{balance-statistics}}

Table \ref{tab:tableB} provides balance statistics for sex and age.

\begin{table}

\caption{\label{tab:tableB}Balance Statistics: Sex, Age}
\centering
\fontsize{9}{11}\selectfont
\begin{tabular}[t]{lrrr}
\toprule
  & Sex (Mean) & Age (Mean) & N (total)\\
\midrule
Happy (No Moral Change) & 0.50 & 39.59 & 147\\
Happy (Yes Moral Change) & 0.53 & 38.98 & 150\\
Neutral (No Moral Change) & 0.48 & 37.80 & 137\\
Neutral (Yes Moral Change) & 0.54 & 38.65 & 138\\
Unhappy (No Moral Change) & 0.48 & 39.55 & 133\\
\addlinespace
Unhappy (Yes Moral Change) & 0.46 & 38.84 & 176\\
\bottomrule
\end{tabular}
\end{table}

\hypertarget{perceived-justice-distribution}{%
\section{Perceived justice: Distribution}\label{perceived-justice-distribution}}

\begin{figure}[H]
\centering
\caption{Perceived justice: Distribution}\label{figA}
        \includegraphics[width=0.7\linewidth]{figA.pdf}
\begin{flushleft}
\end{flushleft}
\end{figure}

\hypertarget{sec:crowdcoding}{%
\section{Crowd-coding of open-ended responses}\label{sec:crowdcoding}}

In total, \texttt{119} Mechanical Turk workers participated in our crowd-sourcing task to classify responses to our open-ended question on aims of punishment. Table \ref{tab:tableC} provides some statistics on the crowd-sourcing task. We had \texttt{881} responses. The plan was to have each response classified by 4 raters, which would result in a total of \texttt{3524} assignments. In the end, our data comprised \texttt{3466} analyzable assignments. We crowd-sourced the data in 5 batches in order to assess the rating quality and other statistics along the way. As suggested by {[}1{]}, we tried to pay workers above the minimum wage of \$7.25 per hour. On average, our workers received a wage of \$7.42 per hour; individual wages varied with working speed. Mechanical Turk workers accepted for our task needed to be located in the U.S., have a HIT Approval Rate (\%) for all Requesters' HITs greater than 97\%, have more than 1000 approved HITs, and hold the `Masters' qualification. Masters are elite groups of Workers who have demonstrated accuracy on specific types of HITs on the Mechanical Turk marketplace. We added the Masters requirement after Batch 1 and noticed a considerable increase in response quality.\\
The crowd-sourcing task is depicted in Figure \ref{fig-mechanicalturk-aims}. We provided raters with a set of possible aims of punishment and asked them to classify each response according to whether certain aims were mentioned or implied by the respondent's answer. For this task we did not randomize the order of the categories, since we wanted raters to get used to the classification interface.

\begin{table}

\caption{\label{tab:tableC}Crowdsourcing statistics}
\centering
\fontsize{9}{11}\selectfont
\begin{tabular}[t]{lr}
\toprule
Statistic & Value\\
\midrule
Time minimum (minutes) & 0.03\\
Time maximum (minutes) & 5.93\\
Average time per assignment (minutes) & 1.65\\
Total time (minutes) & 5731.23\\
Total time (hours) & 95.52\\
\addlinespace
Average pay per assignment (cent) & 20.46\\
\bottomrule
\end{tabular}
\end{table}
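
As a rough consistency check on the figures above, the planned number of assignments and the implied hourly wage can be recomputed from Table \ref{tab:tableC}. This is a sketch that assumes the average hourly wage was obtained by dividing total pay by total working time, which the text does not state explicitly:
\begin{align*}
881 \times 4 &= 3524 \quad \text{(planned assignments)}, \\
3524 - 3466 &= 58 \quad \text{(assignments not analyzable)}, \\
\frac{3466 \times 20.46~\text{cents}}{95.52~\text{hours}} &\approx 742~\text{cents per hour} \approx \text{\$7.42 per hour}.
\end{align*}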

\begin{table}

\caption{\label{tab:tableD}Interrater reliability: Krippendorff's alpha}
\centering
\fontsize{9}{11}\selectfont
\begin{tabular}[t]{lrrr}
\toprule
Category & Alpha & Responses & Raters\\
\midrule
Suffering & 0.49 & 881 & 119\\
Deterrence & 0.64 & 881 & 119\\
Reintegration & 0.36 & 881 & 119\\
Rehabilitation & 0.74 & 881 & 119\\
Amends & 0.38 & 881 & 119\\
\addlinespace
Vengeance & 0.33 & 881 & 119\\
Awareness & 0.63 & 881 & 119\\
\bottomrule
\end{tabular}
\end{table}

Since not all raters coded all responses, we use Krippendorff's alpha as a measure of interrater reliability {[}2{]}. We calculated alpha for each of the 7 categories into which raters could classify a response. The results are shown in Table \ref{tab:tableD}. Krippendorff's alpha ranges from 0.33 to 0.74, i.e., it is relatively satisfactory for some categories, e.g., rehabilitation, and less satisfactory for others, e.g., vengeance.
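
For reference, Krippendorff's alpha compares the disagreement observed among raters with the disagreement expected by chance. The general definition below is standard. The special case assumes that each category was coded as a binary nominal variable (aim mentioned vs.\ not mentioned), which is an assumption made here for illustration and is not stated explicitly above:
\[
\alpha = 1 - \frac{D_o}{D_e},
\qquad
\alpha_{\text{binary}} = 1 - (n - 1)\,\frac{o_{01}}{n_0\, n_1},
\]
where $D_o$ is the observed and $D_e$ the expected disagreement, $o_{01}$ is the entry of the coincidence matrix for mismatching (0--1) pairs of codings, $n_0$ and $n_1$ are the numbers of pairable codings with value 0 and 1, and $n = n_0 + n_1$. A value of $\alpha = 1$ indicates perfect agreement, while $\alpha = 0$ indicates agreement no better than chance.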

\begin{figure}[H]
\centering
\caption{Crowd-coding of responses: Aims of punishment}\label{fig-mechanicalturk-aims}
        \includegraphics[width=1\linewidth]{figB.pdf}
\begin{flushleft}
\end{flushleft}
\end{figure}
\vspace{-0.75cm}

For the main analysis in the paper we chose a conservative strategy. We only coded a response as belonging to a category such as ``suffering'' when at least 3 out of 4 raters agreed that it belonged to that particular category. This is a rather strict cutoff and could mean that we underestimate the prevalence of certain aims in the responses. However, we assume that any such underestimation is relatively constant across aims; hence, it should not affect our conclusions about Hypothesis 1.\\
Crowd-coding is hailed as a useful strategy but is also viewed critically {[}3--6{]}. Because Krippendorff's alpha was low for certain categories, we carried out additional analyses to see whether our results remain robust to the exclusion of certain workers.
Some workers may take the task less seriously than others, which leads to measurement error. Below we exclude the codings of workers who finished their assignments in an average time of less than 0.3 minutes or more than 5 minutes, as well as workers who coded only 1 response. Extremely low average times may reflect superficial codings. Very long times may indicate that workers worked on several assignments in parallel and only finished them once the time ran out. Furthermore, we assume that the quality of coding improves once workers get used to the coding scheme, which is why we drop workers who coded only a single response. The results for this rater subsample are shown in Table \ref{tab:tableE} and Table \ref{tab:tableF}. Krippendorff's alpha increases slightly for most of the categories. However, the main finding, namely the comparatively low share of responses mentioning suffering as an aim of punishment, does not change. In Table \ref{tab:tableF} the share of responses classified as mentioning the aim of suffering is even lower than before the exclusion of certain raters.
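
The coding rule and the rater exclusion criteria described above can be written compactly. The notation is introduced here for illustration only: $x_{irc} \in \{0,1\}$ indicates whether rater $r$ assigned response $i$ to category $c$, $\bar{t}_r$ is the average time (in minutes) that rater $r$ spent per assignment, and $m_r$ is the number of responses coded by rater $r$. The sum runs over the raters who coded response $i$.
\begin{align*}
y_{ic} &= \mathbf{1}\Bigl[\textstyle\sum_{r} x_{irc} \geq 3\Bigr], \\
\text{rater } r \text{ is retained} &\iff 0.3 \leq \bar{t}_r \leq 5 \ \text{ and } \ m_r > 1 .
\end{align*}
Since \texttt{3466} rather than \texttt{3524} assignments were analyzable, at least some responses received fewer than 4 ratings; how the threshold was applied in those cases is not specified above.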

\begin{table}

\caption{\label{tab:tableE}Interrater reliability: Krippendorff's alpha (rater subsample)}
\centering
\fontsize{9}{11}\selectfont
\begin{tabular}[t]{lrrr}
\toprule
Category & Alpha & Responses & Raters\\
\midrule
Suffering & 0.49 & 881 & 67\\
Deterrence & 0.67 & 881 & 67\\
Reintegration & 0.36 & 881 & 67\\
Rehabilitation & 0.76 & 881 & 67\\
Amends & 0.42 & 881 & 67\\
\addlinespace
Vengeance & 0.39 & 881 & 67\\
Awareness & 0.65 & 881 & 67\\
\bottomrule
\end{tabular}
\end{table}

\begin{table}

\caption{\label{tab:tableF}Share of open-ended answers that mention particular aims}
\centering
\begin{tabular}[t]{lrr}
\toprule
  & \% responses & N responses\\
\midrule
Mention aim of suffering & 7 & 63\\
Mention aim of deterrence & 24 & 212\\
Mention aim of reintegration & 1 & 13\\
Mention aim of rehabilitation & 24 & 208\\
Mention aim of amends & 3 & 27\\
\addlinespace
Mention aim of vengeance & 2 & 20\\
Mention aim of awareness & 15 & 135\\
\bottomrule
\end{tabular}
\end{table}

Finally, while Table \ref{tab:tableF} shows the prevalence of certain aims across all respondents, Table \ref{tab:tableG} shows the prevalence of aims of punishment separately for each treatment group. Since we collected the data to test H1 after our survey experiment, one could worry that the considerations elicited by the open-ended question were affected by the experiment. Table \ref{tab:tableG} allows us to explore whether participants' open-ended answers seem to have been influenced by our experimental treatments. While there are some differences, these do not seem strong enough to be problematic for a test of Hypothesis 1.

\begin{table}

\caption{\label{tab:tableG}Share of open-ended responses that mention particular justifications/aims across treatment groups}
\centering
\fontsize{7}{9}\selectfont
\begin{tabu} to \linewidth {>{\raggedright}X>{\raggedleft}X>{\raggedleft}X>{\raggedleft}X>{\raggedleft}X>{\raggedleft}X>{\raggedleft}X>{\raggedleft}X}
\toprule
Treatment & Mention suffering  (\%)   & Mention deterrence  (\%)   & Mention reintegration  (\%)   & Mention rehabilitation  (\%)   & Mention amends  (\%)   & Mention vengeance  (\%)   & Mention awareness  (\%)  \\
\midrule
happy\_nomoralch & 8 & 20 & 1 & 22 & 3 & 1 & 18\\
happy\_yesmoralch & 7 & 25 & 1 & 22 & 3 & 2 & 9\\
neutral\_nomoralch & 8 & 20 & 1 & 19 & 2 & 1 & 17\\
neutral\_yesmoralch & 6 & 23 & 1 & 26 & 4 & 7 & 16\\
unhappy\_nomoralch & 5 & 22 & 0 & 27 & 5 & 1 & 19\\
\addlinespace
unhappy\_yesmoralch & 8 & 32 & 4 & 25 & 2 & 2 & 14\\
\bottomrule
\end{tabu}
\end{table}

\hypertarget{sec:anova}{%
\section{Analysis of variance}\label{sec:anova}}

In addition to the comparisons and models estimated in our `Results' Section, we carried out classical ANOVA analyses. Figure \ref{figC} displays the averages across all treatment groups. Figure \ref{figD} displays the averages with the sample split according to the values of each of our two treatment variables (Suffering and Moral Change), independently of the respective other variable. The individual data points are spread out using jitter.

\begin{figure}[!ht]
\centering
\caption{Means and distributions across all treatment groups}\label{figC}
        \includegraphics[width=.9\linewidth]{figC.pdf}
\begin{flushleft}
\end{flushleft}
\end{figure}

\begin{figure}[!ht]
\centering
\caption{Means and distributions for sample split according to the two treatment variables}\label{figD}
        \includegraphics[width=.9\linewidth]{figD.pdf}
\begin{flushleft}
\end{flushleft}
\end{figure}

One-way ANOVA tests yield significant p-values for differences in group means for both the Suffering treatment (P-value = 0.033) and the Moral Change treatment (P-value = 2.15e-09), indicating that some of the group means differ. While there are only two subsamples (groups or values) for Moral Change, we do not know from this test which pairs of the three Suffering subsamples (groups or values) display statistically significant differences. One-way ANOVA tests splitting the sample into the 6 treatment groups yield the same result.\\
In a next step we perform multiple pairwise comparisons, computing Tukey Honest Significant Differences {[}7{]}, to determine whether the mean differences between specific pairs of groups are statistically significant. We find a highly statistically significant difference between the Moral Change treatments (``no'' vs.~``yes''). The difference is 1.2 (P-value $<$ 0.01). For Suffering there is a significant difference of $-0.63$ when we compare the ``unhappy'' to the ``happy'' category (P-value = 0.02), i.e., the two extreme categories on this three-point scale. The differences between neutral-unhappy and happy-neutral are not statistically significant. ANOVA tests assume normally distributed data and homogeneous variance across groups. We checked the homogeneity-of-variance assumption using Levene's test {[}8{]}. The test indicates a violation for the Moral Change groups but not for the Suffering groups. For this reason we compute a non-parametric alternative to the one-way ANOVA test, namely the Kruskal-Wallis rank sum test {[}9{]}. The results of the rank sum test indicate significant differences between treatment groups for both treatment variables, Moral Change and Suffering. These results mirror the findings from our main analysis; we therefore refer the reader back to the `Results' Section in the main paper.
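
For reference, the one-way ANOVA and Kruskal-Wallis test statistics take the following standard forms. The notation is introduced here for illustration: $k$ is the number of groups, $N$ the total number of observations, $n_j$ the size of group $j$, $y_{ij}$ the outcome of observation $i$ in group $j$, $\bar{y}_j$ and $\bar{y}$ the group and overall means, and $R_j$ the sum of ranks in group $j$.
\begin{align*}
F &= \frac{\sum_{j=1}^{k} n_j (\bar{y}_j - \bar{y})^2 / (k - 1)}{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 / (N - k)}, \\
H &= \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1).
\end{align*}
Under the null hypothesis and the ANOVA assumptions, $F$ follows an $F$ distribution with $(k-1,\, N-k)$ degrees of freedom. $H$ is shown without the correction for ties and is approximately $\chi^2$ distributed with $k-1$ degrees of freedom. Assuming that all \texttt{881} respondents enter the analysis, $k = 3$ for Suffering and $k = 2$ for Moral Change.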

\clearpage

\hypertarget{contrasting-open-ended-ranking-and-classic-retributivism-scale}{%
\section{Contrasting open-ended, ranking and classic retributivism scale}\label{contrasting-open-ended-ranking-and-classic-retributivism-scale}}

Table \ref{tab:tableH} displays the open-ended responses after they have been classified according to whether they mentioned particular aims of punishment. However, as opposed to Table 1 in the main paper, we now show the distributions for different values on the retributivism scale, which has 11 values. Specifically, we show those distributions for respondents with low values on the scale (0--3) and for respondents with high values on the scale (7--10). As was to be expected, the share of respondents who mention suffering as an aim in their open-ended response is higher among those who also picked high values on the retributivism scale. Nonetheless, those shares are lower than one would expect. To some extent this is certainly related to the way we coded those open-ended responses. However, even if those values varied under a different coding scheme, the numbers would still be in the lower range. Further below we contrast the explicit retributivism scale with the ranking question on aims of punishment.

\begin{table}[!h]

\caption{\label{tab:tableH}Share of open-ended answers that mention particular aims for particular responses on the closed retributivism scale}
\centering
\fontsize{7}{9}\selectfont
\begin{tabu} to \linewidth {>{\raggedright}X>{\raggedright}X>{\raggedright}X>{\raggedright}X>{\raggedright}X>{\raggedright}X>{\raggedright}X>{\raggedright}X>{\raggedright}X}
\toprule
  & 0 & 1 & 2 & 3 & 7 & 8 & 9 & 10\\
\midrule
Mention aim of suffering & 0 (0) & 0 (0) & 0 (0) & 0 (0) & 0.07 (10) & 0.07 (9) & 0.17 (15) & 0.14 (20)\\
Mention aim of deterrence & 0.16 (5) & 0.12 (3) & 0.28 (11) & 0.26 (16) & 0.26 (38) & 0.23 (31) & 0.25 (22) & 0.24 (35)\\
Mention aim of reintegration & 0 (0) & 0 (0) & 0.05 (2) & 0.03 (2) & 0.01 (1) & 0.02 (3) & 0.01 (1) & 0 (0)\\
Mention aim of rehabilitation & 0.39 (12) & 0.32 (8) & 0.31 (12) & 0.33 (20) & 0.24 (35) & 0.17 (23) & 0.17 (15) & 0.12 (18)\\
Mention aim of amends & 0.03 (1) & 0.08 (2) & 0 (0) & 0.03 (2) & 0.03 (5) & 0.02 (3) & 0 (0) & 0.01 (2)\\
\addlinespace
Mention aim of vengeance & 0.03 (1) & 0.08 (2) & 0.03 (1) & 0 (0) & 0.03 (5) & 0.02 (3) & 0.05 (4) & 0.01 (1)\\
Mention aim of awareness & 0.1 (3) & 0.08 (2) & 0.15 (6) & 0.2 (12) & 0.15 (22) & 0.22 (29) & 0.11 (10) & 0.12 (17)\\
\bottomrule
\end{tabu}
\end{table}

Figure \ref{figE} visualizes the results of the ranking question, which provides respondents with a pre-defined choice set of aims of punishment. However, we now visualize those rankings for subsets of participants who picked particular values on the retributivism scale, either low values (0--3) or high values (7--10). Again we observe that respondents who pick high values on the retributivism scale more often rank the aim of desert first. However, far from everyone does. For instance, across both low and high values of the classic retributivism scale, a large share of people rank the aim of deterrence first. In other words, when contrasted with the classic retributivism scale, both our open-ended measure and our ranking measure reveal that while there is overlap, there is also considerable variation behind the same value on this scale.

\begin{figure}[!ht]
\centering
\caption{Rankings of aims for different values on the retributivism scale}\label{figE}
        \includegraphics[width=1\linewidth]{figE.pdf}
\begin{flushleft}
\end{flushleft}
\end{figure}

\clearpage

\hypertarget{r-session-info}{%
\section{R session info}\label{r-session-info}}

\begin{verbatim}
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
## 
## Matrix products: default
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] purrr_0.3.3      gridExtra_2.3    ggpubr_0.2.4     magrittr_1.5    
##  [5] irr_0.84.1       lpSolve_5.6.13.3 tidyr_1.0.0      kableExtra_1.1.0
##  [9] xtable_1.8-4     stringr_1.4.0    readr_1.3.1      stargazer_5.2.2 
## [13] dplyr_0.8.5      plotly_4.9.1     ggplot2_3.2.1    haven_2.2.0     
## [17] knitr_1.28      
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_0.2.5  xfun_0.13         colorspace_1.4-1  vctrs_0.2.4      
##  [5] htmltools_0.4.0   viridisLite_0.3.0 yaml_2.2.1        rlang_0.4.5      
##  [9] pillar_1.4.3      glue_1.4.0        withr_2.1.2       lifecycle_0.2.0  
## [13] munsell_0.5.0     ggsignif_0.6.0    gtable_0.3.0      rvest_0.3.5      
## [17] htmlwidgets_1.5.1 evaluate_0.14     labeling_0.3      forcats_0.4.0    
## [21] Rcpp_1.0.4.6      scales_1.1.0      webshot_0.5.2     jsonlite_1.6.1   
## [25] farver_2.0.1      hms_0.5.2         digest_0.6.25     stringi_1.4.6    
## [29] bookdown_0.18     tools_3.6.2       lazyeval_0.2.2    tibble_2.1.3     
## [33] crayon_1.3.4      pkgconfig_2.0.3   ellipsis_0.3.0    data.table_1.12.8
## [37] xml2_1.3.2        assertthat_0.2.1  rmarkdown_2.1     httr_1.4.1       
## [41] rstudioapi_0.11   R6_2.4.1          compiler_3.6.2
\end{verbatim}

\clearpage

\hypertarget{references}{%
\section*{References}\label{references}}
\addcontentsline{toc}{section}{References}

\hypertarget{refs}{}
\leavevmode\hypertarget{ref-Williamson2016-az}{}%
1. Williamson V. On the ethics of crowdsourced research. PS Polit Sci Polit. Cambridge University Press; 2016;49: 77--81.

\leavevmode\hypertarget{ref-Krippendorff2013-lj}{}%
2. Krippendorff K. Content analysis: An introduction to its methodology. Sage; 2013.

\leavevmode\hypertarget{ref-Snow2008-gq}{}%
3. Snow R, O'Connor B, Jurafsky D, Ng. Cheap and fast---but is it good?: Evaluating non-expert. Proceedings of the conference on empirical methods in natural. Stroudsburg, PA, USA: Association for Computational Linguistics; 2008. pp. 254--263.

\leavevmode\hypertarget{ref-Benoit2016-gp}{}%
4. Benoit K, Conway D, Lauderdale BE and. Crowd-sourced text analysis: Reproducible and agile production. Am Polit Sci Rev. Cambridge University Press; 2016;110: 278--295.

\leavevmode\hypertarget{ref-Lind2017-fo}{}%
5. Lind F, Gruber M, Boomgaarden HG. Content analysis by the crowd: Assessing the usability of. Commun Methods Meas. 2017;11: 191--209.

\leavevmode\hypertarget{ref-Dreyfuss2018-im}{}%
6. Dreyfuss E, Barrett B, Newman LH. A bot panic hits Amazon's Mechanical Turk. Wired. 2018;

\leavevmode\hypertarget{ref-Yandell2017-ww}{}%
7. Yandell B. Practical data analysis for designed experiments. Routledge; 2017.

\leavevmode\hypertarget{ref-Fox2016-it}{}%
8. Fox J. Applied regression analysis and generalized linear models. Sage; 2016.

\leavevmode\hypertarget{ref-Hollander1973-sp}{}%
9. Hollander M, Wolfe DA. Nonparametric statistical methods. John Wiley \& Sons; 1973.

\end{document}
